Image combination/conversion apparatus

ABSTRACT

An object of the present invention is to form, simply and with a small amount of computation, a mapping table which transforms an image picked up by a real camera into an image viewed from a virtual viewpoint.
     An image transforming system of the present invention has an imaging means (10) for picking up a real image, a coordinate recording means (32) for recording a three-dimensional position on a projection model corresponding to a previously-computed pixel position of a virtual camera, a means for computing a positional relationship between the virtual camera and the imaging means (10), means (20), (30) for forming and recording a mapping table, which transforms the real image into an image viewed from the virtual camera, based on the positional-relationship information and the three-dimensional-position information recorded in the coordinate recording means (32), and a means (40) for mapping the real image by using the mapping table. When the installation position of the imaging means (10) is decided in advance, the means for computing the positional relationship predicts the positional relationship from that installation position and inputs it, as a camera parameter, into the means (20) for forming the mapping table.

TECHNICAL FIELD

[0001] The present invention relates to an image synthesizing/transforming system which synthesizes/transforms, in real time, an image picked up by a real camera into an image picked up by a virtual camera.

BACKGROUND ART

[0002] As a related-art image synthesizing/transforming system for synthesizing/transforming images picked up by a plurality of real cameras, there is, for example, the system set forth in International Publication WO 00/64175. This system will be explained with reference to FIG. 10 hereunder.

[0003] The image synthesizing/transforming system in the related art is configured to have an imaging means 110 and an image processing portion 120. The imaging means 110 includes a plurality of cameras 111, 112 and frame memories 113, 114 corresponding to the respective cameras 111, 112. Images input from the respective cameras 111, 112 are written into the corresponding frame memories 113, 114.

[0004] The image processing portion 120 includes an image synthesizing means 121, a mapping table looking-up means 122, and a video signal generating means 123. The mapping table looking-up means 122 includes a transformation address memory 131 for storing transformation addresses (a mapping table) indicating correspondences between position coordinates of output pixels and position coordinates of input pixels, and a degree-of-necessity memory 132 for recording the degrees-of-necessity of the respective input pixels at that time.

[0005] The image synthesizing means 121 generates data of the output pixels by adding data of the respective pixels in the frame memories 113, 114 according to the designated degrees-of-necessity, based on the transformation addresses (mapping table) recorded in the mapping table looking-up means 122. The video signal generating means 123 outputs the data of the output pixels generated by the image synthesizing means 121 as an image signal. In this case, the above processes are carried out based on an appropriate synchronizing signal, such as that of an input image signal, for example.

[0006] The image synthesizing means 121 synthesizes in real time the images input from the two different cameras 111, 112 in compliance with the mapping table looking-up means 122; by generating the output image while changing the pixel positions, it can smoothly synthesize the input images from a plurality of different cameras and transform the input images into an image viewed from a virtual viewpoint. However, in order to execute the image synthesis in real time, the mapping table used in the image synthesis must be recorded in the mapping table looking-up means 122 in advance.

[0007] Next, procedures of forming the mapping table will be explained hereunder. In order to form the mapping table, coordinates of the pixels of the respective camera images corresponding to the respective pixels of the synthesized image viewed from the virtual viewpoint (the installing position of the virtual camera) must be decided. Procedures of deciding this correspondence are classified into two phases: a phase in which positions of points on a global coordinate system, which correspond to respective pixels of the synthesized image viewed from the virtual viewpoint, are calculated, and a phase in which coordinates of the pixels on the real camera, which correspond to the calculated positions of the points on the global coordinate system, are calculated.

[0008] In this case, the relationships finally recorded in the mapping table are only the relationships between the respective pixels of the synthesized image viewed from the virtual viewpoint and the pixels of the respective camera images (real images). The procedures of forming the mapping table are not limited to a system that goes via the points on the above global coordinate system. However, the mapping table formed via the points on the global coordinate system is excellent for forming a synthesized image in which surrounding circumstances can be correlated easily with actual distances and positional relationships, since the meaning of the synthesized image on the global coordinate system, as the coordinate system of the real world, can be made clear.

[0009] A relationship between the pixel position [mi]=(xi, yi) of the virtual camera and the camera coordinate [Pi]=(Xi, Yi, Zi) of the virtual camera is defined as follows.

xi=Xi/Zi (where Zi is not 0)

yi=Yi/Zi (where Zi is not 0)

[0010] The transformation from the camera coordinate [Pi] of the virtual camera to the global coordinate [Pw] is executed by using a three-dimensional rotation [Ri] and a translation [Ti] as follows.

[Pw]=[Ri][Pi]+[Ti]

[0011] Similarly, the transformation from the global coordinate [Pw] into the camera coordinate [Pr] of the real camera is executed by using a three-dimensional rotation [Rr] and a translation [Tr] as follows.

[Pr]=[Rr][Pw]+[Tr]
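
As a minimal sketch of these two rigid transforms, assuming the rotations [Ri], [Rr] and translations [Ti], [Tr] are available as NumPy arrays (all names here are illustrative, not part of the patent):

    import numpy as np

    def virtual_to_global(Pi, Ri, Ti):
        # [Pw] = [Ri][Pi] + [Ti]: virtual-camera coordinates to global.
        return Ri @ Pi + Ti

    def global_to_real(Pw, Rr, Tr):
        # [Pr] = [Rr][Pw] + [Tr]: global coordinates to real-camera coordinates.
        return Rr @ Pw + Tr

    # Example with identity rotations and zero translations.
    Pi = np.array([1.0, 2.0, 3.0])
    Pr = global_to_real(virtual_to_global(Pi, np.eye(3), np.zeros(3)),
                        np.eye(3), np.zeros(3))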

[0012] The transformation from the camera coordinate system of the virtual camera to the global coordinate system and the transformation from the global coordinate system to the camera coordinate system of the real camera are schematically shown in FIG. 11. That is, an image M represented by the camera coordinate system C of the virtual camera and an image M′ represented by the camera coordinate system C′ of the real camera are correlated with each other via the global coordinate system O of the image.

[0013] Also, the transformation from the camera coordinate [Pr]=(Vxe, Vye, Vze) of the real camera to a two-dimensional coordinate [Mr]=(xr, yr) of the real camera on the viewing screen is executed based on the perspective projection transformation by using a focal length fv as follows.

xr=(fv/Vze) Vxe

yr=(fv/Vze) Vye

[0014] The position obtained by transforming this coordinate into units of pixels and correcting it in light of the lens distortion of the real camera corresponds to the position of the pixel of the real camera. In order to correct the lens distortion, there are a system utilizing a table in which relationships between the distance from the lens center and the amount of correction are recorded, a system approximating the distortion by a mathematical distortion model, etc.
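
A compact sketch of this projection-and-correction step, assuming the mathematical distortion model takes the common one-coefficient radial form (the coefficient k1 is a hypothetical parameter; a correction table indexed by the distance from the lens center would serve equally well):

    def project_perspective(Vxe, Vye, Vze, fv, k1=0.0):
        # Perspective projection: xr = (fv/Vze)*Vxe, yr = (fv/Vze)*Vye.
        xr = (fv / Vze) * Vxe
        yr = (fv / Vze) * Vye
        # One-coefficient radial correction (hypothetical model).
        r2 = xr * xr + yr * yr
        scale = 1.0 + k1 * r2
        return xr * scale, yr * scale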

[0015] At this time, since the three-dimensional profile of the subject existing on the global coordinate system is unknown, a scale factor λ (λ is a real number other than 0) of [Pi] becomes indefinite in the transformation from the pixel position [mi] of the virtual camera to the camera coordinate [Pi] of the virtual camera. That is, in FIG. 12, all points on a straight line l, e.g., a point K and a point Q, are projected onto the same pixel position X (xi, yi). Therefore, one point on the straight line l is decided by assuming an appropriate projection model as the profile of the object viewed from the virtual viewpoint. That is, the intersection point between the projection model and the straight line l is set as the point on the global coordinate system.

[0016] In this case, a Zw=0 plane or the like on the global coordinate system, for example, may be considered as the projection model. If an appropriate projection model is set in this manner, correspondences between the respective pixels [Pi] on the synthesized image viewed from the virtual viewpoint and the pixels [Pr] on the real camera image can be computed by the above procedures.
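
For the Zw=0 plane, fixing the scale factor λ reduces to a ray-plane intersection; a minimal sketch, with the ray origin and direction expressed in global coordinates as NumPy arrays (names illustrative):

    import numpy as np

    def intersect_ground_plane(origin, direction):
        # Intersect the viewing ray origin + lam*direction with Zw = 0.
        if abs(direction[2]) < 1e-9:
            return None                      # ray parallel to the plane
        lam = -origin[2] / direction[2]      # fixes the scale factor lambda
        if lam <= 0:
            return None                      # intersection behind the viewpoint
        return origin + lam * direction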

[0017] In order to compute these correspondences, a great deal of computation is needed: e.g., coordinate calculation of the points on the projection model, transformation between the camera coordinate system and the global coordinate system, and, if the number of cameras is large, computation to decide onto which camera a coordinate on the projection model is projected.

[0018] Meanwhile, the demand for images having a wide visual field grows as monitor cameras used for surveillance, car-equipped cameras used for driving assistance, etc. spread widely. Therefore, it is requested that an image picked up by a single camera using a super-wide-angle lens such as a fisheye lens, or images picked up by a plurality of cameras, be synthesized/transformed to provide an image which can be viewed as if it were picked up by one camera. There also appear nowadays applications in which only a necessary area is extracted from the wide-visual-field image, deformed and displayed, or the image is transformed into a pseudo-image picked up by the virtual camera and displayed, etc.

[0019] In order to execute such synthesis/transformation of the image by applying the above related art, a large amount of computation is needed as described above. For this reason, a computing unit having huge computation power is required to execute the computation in real time, and such synthesis/transformation is not practical. As a result, the mainstream is a system for recording correspondences between input images and output images as a mapping table by executing the computation in advance, and then synthesizing/transforming the image in real time while looking up the mapping table.

[0020] In order to utilize the previously-computed mapping table, since the mapping table depends on the installation position of the actual camera, the actual camera must be set at exactly the same position as the installation position of the camera that was used at the time of computation of the mapping table. However, this approach is not very practical. Also, if the installed position of the camera is displaced for any cause in the course of use after the camera has been installed exactly, the camera must be restored to the original installation position, and this approach is also not practical.

[0021] Since it is thus not practical to physically adjust the installation position of the actual camera, preferably the mapping table should be computed after the camera is installed. In this case, if the mapping table is computed inside the image synthesizing/transforming equipment, a high-performance computing unit that can execute a huge amount of computation is needed. However, since the high-performance computing unit is not used ordinarily after the mapping table has been computed, this approach is also not practical.

[0022] Also, if the mapping table is computed by an external high-performance computing unit, the computed mapping table must be transferred from the external unit into the image synthesizing/transforming equipment. For example, if the image synthesizing/transforming system is built into a device in a vehicle, or the like, it is not practical to install on the outside of the device a dedicated interface which is used to transfer the mapping table but is not used ordinarily.

[0023] Accordingly, it is expected that a device having a previously-provided external interface is used together. In this case, the mapping table needs a data transmission of (number of pixels)×(mapping data capacity per pixel); for example, assuming a 640×480-pixel output image and 4 bytes of mapping data per pixel, roughly 1.2 MB must be transferred, so a high-speed transfer environment is needed. There is nowadays the CAN bus as an interface that can execute data transmission in a vehicle. This interface is intended to transfer control data and is not intended to transfer large data like the mapping table. Thus, this approach cannot be called practical.

[0024] The object of the present invention is to provide an inexpensive image synthesizing/transforming system which makes it possible to calculate a mapping table without a high-performance computing unit after the cameras are installed, has wide versatility, and is easy to maintain.

DISCLOSURE OF INVENTION

[0025] In order to attain the above object, the present invention provides an image transforming system for transforming an image input from imaging means, which picks up a real image, into a virtual image viewed from a predetermined virtual viewpoint, and outputting it, which has: first storing means for recording a correspondence between a pixel on the virtual image and a point on a projection model; second storing means for recording a correspondence between the pixel on the virtual image and a pixel on the real image, wherein the pixel on the real image and the pixel on the virtual image are mutually correlated via a point on a predetermined projection model; inputting means for inputting a positional relationship between the imaging means and the virtual viewpoint; and computing means for rewriting contents recorded in the second storing means based on contents in the first storing means and the positional relationship.

[0026] According to this configuration, the mapping table can be computed after the imaging means is installed without building in a high-performance computing unit, and image synthesis/transformation can be executed with an inexpensive configuration.

[0027] Preferably, the image transforming system further has means for predicting the positional relationship based on an installation position of the imaging means, in place of the inputting means, when the installation position of the imaging means is decided in advance. Also, the image transforming system may further have means for obtaining a relative position of the imaging means with respect to the point on the projection model as calibration data and predicting the positional relationship based on the calibration data, in place of the inputting means. Further, the positional relationship between the virtual camera and the imaging means may be obtained by using calibration data acquired by external calibrating means, in place of the inputting means.

[0028] Accordingly, ease of installation of the image synthesizing/transforming system is improved. Also, even when the fitting positions of the imaging means are displaced for any cause, the mapping table can be reformed without an external high-performance computing unit, and thus maintenance is facilitated. In addition, if the calibrating means is built in, all the processes can be completed by internal processing, and the processes executed after the installation of the imaging means can be carried out by the image synthesizing/transforming system alone.

[0029] More preferably, the first storing means includes recording means for recording aggregation data of points on the projection model in a compressed format by predictive coding, and decompressing means for decompressing the aggregation data compressed and recorded by the recording means to restore the aggregation data to the original format. According to this configuration, the necessary memory capacity can be reduced and the data can be decompressed with a small amount of computation.

BRIEF DESCRIPTION OF DRAWINGS

[0030] FIG. 1 is a block diagram of an image synthesizing/transforming system according to a first embodiment of the present invention;

[0031] FIG. 2 is an explanatory view of a viewing transformation by projection models used in the image synthesizing/transforming system according to the first embodiment of the present invention;

[0032] FIG. 3 is an explanatory view of a correspondence between a virtual camera and a real camera in the mapping table used in the image synthesizing/transforming system according to the first embodiment of the present invention;

[0033] FIG. 4 is a block diagram of a three-dimensional coordinate recording means used in an image synthesizing/transforming system according to a second embodiment of the present invention;

[0034] FIG. 5 is an explanatory view of a predictive coding used in the image synthesizing/transforming system according to the second embodiment of the present invention;

[0035] FIG. 6 is a block diagram of an image synthesizing/transforming system according to a third embodiment of the present invention;

[0036] FIG. 7 is a block diagram of a calibrating means used in the image synthesizing/transforming system according to the third embodiment of the present invention;

[0037] FIG. 8 is a view showing an example of a monitor display of the calibrating means used in the image synthesizing/transforming system according to the third embodiment of the present invention;

[0038] FIG. 9 is an explanatory view of a mapping table used in the image synthesizing/transforming system according to the first embodiment of the present invention;

[0039] FIG. 10 is a block diagram of an image synthesizing system in the related art;

[0040] FIG. 11 is a view of relationships among the camera coordinate of a virtual camera, the camera coordinate of a real camera, and the global coordinate in the related art; and

[0041] FIG. 12 is an explanatory view of a perspective projection in the related art.

[0042] In the figures, reference numeral 10 is an imaging means, 11 and 12 are cameras, 13 and 14 are frame memories, 20 is a computing means, 30 is a mapping table looking-up means, 31 is a mapping table recording means, 32 is a three-dimensional coordinate recording means, 33 is a decompressing means, 34 is a compressing means, 40 is an image synthesizing means, 50 is an image outputting means, 60 is a calibrating means, 61 is a controller, 62 is a calibration computing means, 63 is a mark superposing means, and 64 is a monitor.

BEST MODE FOR CARRYING OUT THE INVENTION

[0043] Embodiments of the present invention will be explained with reference to the drawings hereinafter.

First Embodiment

[0044] FIG. 1 is a block diagram of an image synthesizing/transforming system according to a first embodiment of the present invention. This image synthesizing/transforming system is configured to have an imaging means 10, a computing means 20, a mapping table looking-up means 30, an image synthesizing means 40, and an image outputting means 50.

[0045] The imaging means 10 includes a camera 11 for picking up the real image, and a frame memory 13 for recording the image picked up by the camera 11. The imaging means 10 may include a plurality of cameras 11. The imaging means 10 of this embodiment includes a camera 12 and a frame memory 14 in addition to the camera 11 and the frame memory 13.

[0046] The computing means 20 computes the pixel positions of the input image which are necessary for generation of the output image, based on a camera parameter input separately and on the three-dimensional positions on a projection model which correspond to the previously-computed pixel positions of the virtual camera, obtained by looking up the mapping table looking-up means 30. The computed results are recorded in the mapping table looking-up means 30 as the mapping table.

[0047] The mapping table looking-up means 30 includes a mapping table recording means 31 for recording the mapping table, and a three-dimensional coordinate recording means 32 for recording the three-dimensional positions on the projection model which correspond to the previously-computed pixel positions of the virtual camera described later. In this embodiment, the mapping table is computed based on the three-dimensional position data on the projection model recorded in the three-dimensional coordinate recording means 32 and the installation position of the real camera, and the computed mapping table is recorded in the mapping table recording means 31.

[0048] The image synthesizing means 40 reads the input image corresponding to the pixels of the output image from the imaging means 10 by looking up the mapping table looking-up means 30, and generates the pixels of the output image. Also, the image outputting means 50 generates the output image from the pixels that the image synthesizing means 40 generated, and outputs the output image.

[0049] Next, an operation of the above image synthesizing/transforming system will be explained hereunder. The mapping table used by the image synthesizing means 40 depends on the installation position of the imaging means 10 even though the position of the virtual viewpoint is fixed. Therefore, the mapping table must be formed at the stage at which the system is first used after the imaging means 10 is installed. At first, procedures of forming the mapping table will be explained hereunder.

[0050] The three-dimensional position data on the projection model corresponding to the previously-computed pixel positions of the virtual camera are recorded in the three-dimensional coordinate recording means 32. The projection model is set so as to eliminate the infinity caused by the perspective transformation, and is defined by a flat plane, a cylindrical plane, etc., for example.

[0051] FIG. 2 is a view showing an example in which two planes, a plane A and a plane B, are set as the projection model. In the case of FIG. 2, for example, a coordinate (x1a, y1a, z1a) of a point R1A on the plane A is recorded in the three-dimensional coordinate recording means 32 as the three-dimensional position corresponding to a position (u1, v1) of a pixel R1 of the output image, and a coordinate (x2b, y2b, z2b) of a point R2B on the plane B is recorded in the three-dimensional coordinate recording means 32 as the three-dimensional position corresponding to a position (u2, v2) of a pixel R2 of the output image.

[0052] These points on the three-dimensional coordinate system are computed as intersection points between a straight line indicating a viewing vector and the projection model planes. Therefore, if the projection model is defined by a multi-dimensional polynomial such as a curved surface, etc., the amount of computation applied to compute the three-dimensional coordinates becomes huge. Also, as is apparent from FIG. 2, if the projection model is defined by a plurality of planes and curved surfaces, a plurality of candidate points on the projection model corresponding to a pixel on the image of the virtual camera are present. Therefore, all these candidate points are computed as intersection points between the straight line indicating the viewing vector and the projection model planes, and thus as many intersection-point computations are needed as there are candidate points.

[0053] More particularly, in FIG. 2, the point R1A projected onto the plane A and the point R1B projected onto the plane B are present as the candidate points corresponding to the point R1, and the point R1A, which is closer in distance to the virtual camera, is selected as the corresponding point among these two candidate points. Similarly, the point R2A projected onto the plane A and the point R2B projected onto the plane B are present as the candidate points corresponding to the point R2, and the point R2B, which is closer in distance to the virtual camera, is selected as the corresponding point.
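
A sketch of this candidate selection for a model built from general planes n·x+d=0, assuming the ray starts at the virtual camera position and all quantities are NumPy arrays (names illustrative):

    import numpy as np

    def intersect_plane(origin, direction, n, d):
        # Ray/plane intersection for the plane n.x + d = 0, or None when
        # the ray is parallel or the hit lies behind the camera.
        denom = float(n @ direction)
        if abs(denom) < 1e-9:
            return None
        lam = -(d + float(n @ origin)) / denom
        return origin + lam * direction if lam > 0 else None

    def nearest_candidate(origin, direction, planes):
        # Among all candidate intersections, keep the one closest to the
        # virtual camera, as R1A is chosen over R1B in FIG. 2.
        candidates = []
        for n, d in planes:
            p = intersect_plane(origin, direction, n, d)
            if p is not None:
                candidates.append(p)
        if not candidates:
            return None
        return min(candidates, key=lambda p: np.linalg.norm(p - origin))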

[0054] Here, which point should be selected as the corresponding point from the plurality of candidate points depends on the definition of the model. Computations such as distance computation, for example, are required to contract the plurality of candidate points into one. The computation of the candidate points is executed with the same procedures as in the related art.

[0055] In FIG. 2, the case where one real camera is used is shown. However, as shown in FIG. 3, when a plurality of real cameras are used, the points on the respective cameras which correspond to the pixels on the image of the virtual camera are computed by applying the same process to the respective cameras.

[0056] In FIG. 3, the points on the projection model planes corresponding to three pixels on the virtual camera are shown as R3a, R3b, R3c. The coordinates of R3a, R3b, R3c are recorded in the three-dimensional coordinate recording means 32 as the three-dimensional positions of the corresponding points on the projection model, which correspond to the respective pixels on the virtual camera.

[0057] Here, if the positional relationships between the virtual camera and the real cameras can be predicted in advance with predetermined precision, it can be computed on which camera each corresponding point on the projection model has a corresponding pixel. For example, since the installation position of a monitor camera, a car-equipped camera, or the like is normally restricted to positions which permit picking up the image of the monitoring object, or the like, the positional relationships between the virtual camera and the real cameras can be predicted in advance. Therefore, the mapping table can be formed by inputting the predicted positional data of the real cameras into the computing means 20 as the camera parameter and using the data recorded in the three-dimensional coordinate recording means 32.

[0058] Also, since the pixel positions of the real cameras with respect to the corresponding points on the projection model, which correspond to the pixels on the virtual camera, can be measured easily by well-known calibrating means, the positional relationships between the virtual camera and the real cameras can be set by receiving the measured data. An example in which a calibrating means is built in will be explained as a third embodiment described later.

[0059] In FIG. 3, the point R3a has a corresponding point only on the image of the real camera 11, and the point R3c has a corresponding point only on the image of the real camera 12. Also, the point R3b has corresponding points on both the image of the camera 11 and the image of the camera 12.

[0060] Under this situation, it is wasteful to execute computation to compute the corresponding point of the point R3a with respect to the camera 12, or to compute the corresponding point of the point R3c with respect to the camera 11. Therefore, if not only the three-dimensional coordinates corresponding to the pixels on the image of the virtual camera but also the identifying codes of the cameras on whose screens the corresponding pixels are present are recorded in the three-dimensional coordinate recording means 32, for example, useless computation is not applied to the real cameras on which the corresponding pixels cannot be present at all, and thus the amount of computation required for the formation of the mapping table can be reduced.

[0061] Also, since the degree-of-necessity of the respective cameras required to compute the pixels of the output image from the pixels of plural cameras can be computed in advance in addition to the identifying codes of the cameras, the degree-of-necessity can be recorded at the same time. Therefore, the computation of the degree-of-necessity can also be omitted. The degree-of-necessity of each camera can be computed based on the three-dimensional position on the projection model planes, for example, by normalizing a ratio of the reciprocals of the distances to the respective real cameras, or the like. The meaning of the degree-of-necessity will be explained in the operation explanation of the image synthesizing means 40.
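
A sketch of the reciprocal-distance weighting suggested above, with the model point and camera positions given in global coordinates (names illustrative):

    import numpy as np

    def necessity_weights(model_point, camera_positions):
        # Reciprocals of the distances to the real cameras, normalized so
        # the degrees-of-necessity of all cameras sum to one.
        inv = np.array([1.0 / np.linalg.norm(model_point - c)
                        for c in camera_positions])
        return inv / inv.sum()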

[0062] In this manner, in the image synthesizing/transforming system of the present embodiment, if the three-dimensional positions of the corresponding points on the projection model corresponding to the pixels on the virtual camera are computed in advance and recorded in the three-dimensional coordinate recording means 32, the huge amount of computation required to compute the three-dimensional positions recorded in the three-dimensional coordinate recording means 32 is not required of the computing means 20. Even when the installation positions of the real cameras are displaced, the computing means 20 can compute the mapping table corresponding to the new installation positions of the real cameras at a high speed by using the data in the three-dimensional coordinate recording means 32.

[0063] The computing means 20 computes the pixel positions on the real cameras corresponding to the pixel positions of the virtual camera, based on the three-dimensional coordinates obtained by looking up the three-dimensional coordinate recording means 32 in correspondence with the pixel positions of the virtual camera, and on the camera parameter of the real cameras input separately. In the case of FIG. 2, as described above, the coordinate (x1a, y1a, z1a) of the point R1A on the plane A, as the three-dimensional position corresponding to the position (u1, v1) of the pixel R1 of the output image, and the coordinate (x2b, y2b, z2b) of the point R2B on the plane B, as the three-dimensional position corresponding to the position (u2, v2) of the pixel R2 of the output image, for example, are recorded in the three-dimensional coordinate recording means 32.

[0064] When the points of the real cameras onto which these points are projected are computed by the perspective transformation, the point R1A and the point R2B are projected onto a point I1 (U1, V1) and a point I2 (U2, V2), respectively. The computing means 20 forms the mapping table based on this result, and stores the table into the mapping table recording means 31.
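
The pass performed by the computing means 20 can then be sketched as a single loop over the recorded coordinates, assuming a project() callable that bundles the perspective transformation and distortion correction for the camera parameter at hand (names illustrative):

    def build_mapping_table(coord_table, project):
        # coord_table: {(u, v): (x, y, z)} held by the three-dimensional
        # coordinate recording means 32; project() maps a model point to a
        # real-camera pixel (U, V) for the given camera parameter.
        return {(u, v): project(xyz) for (u, v), xyz in coord_table.items()}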

[0065] In the situation where a plurality of real cameras are present, where the positional relationships between the virtual camera and the real cameras can be predicted in advance with predetermined precision, and where the three-dimensional coordinates corresponding to the pixels on the image of the virtual camera and the identifying codes of the cameras on whose screens the corresponding pixels are present are recorded in the three-dimensional coordinate recording means 32, the computing means 20 computes only the pixel positions corresponding to the cameras whose identifying codes are recorded.

[0066] The mapping table recording means 31 records the mapping table indicating the correspondences between the pixels on the virtual camera computed by the computing means 20 and the pixels on the real camera. FIG. 9 is an explanatory view of this mapping table. The mapping table recording means 31 has a first storing means for storing the relationship between the pixel coordinate position (u, v) of the virtual camera and the coordinate (x, y, z) on the projection model. The computing means 20 computes the relationship between the coordinate on the projection model and the pixel coordinate position (U, V) of the real camera based on the contents stored in the first storing means, then forms the relationship between the pixel coordinate position (u, v) of the virtual camera and the pixel coordinate position (U, V) of the real camera, and then stores this relationship in a second storing means as the mapping table. As the case may be, the identifying code (indicated as “C1” in FIG. 9) of the real camera and the degree-of-necessity of each camera, when a plurality of cameras correspond, are recorded in the mapping table.
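
One possible in-memory layout for a row of the table in FIG. 9, including the optional identifying code and degree-of-necessity (the field names are illustrative, not the patent's):

    from dataclasses import dataclass

    @dataclass
    class MappingEntry:
        camera_id: str     # identifying code of the real camera, e.g. "C1"
        U: int             # real-camera pixel column
        V: int             # real-camera pixel row
        necessity: float   # degree-of-necessity when several cameras overlap

    # One virtual pixel keys one entry per real camera that sees its point.
    table = {(120, 45): [MappingEntry("C1", 310, 200, 0.7),
                         MappingEntry("C2", 18, 455, 0.3)]}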

[0067] Next, the operation executed after the computing means 20 forms the mapping table by using the data recorded in the three-dimensional coordinate recording means 32 and records the table in the mapping table recording means 31 will be explained hereunder.

[0068] In the imaging means 10, the images picked up by the camera 11 and the camera 12 are recorded in the frame memories 13, 14, respectively. The mapping table looking-up means 30 transforms the pixel position of the output image generated by the image synthesizing means 40 into the pixel position of the input image corresponding to that pixel, by looking up the mapping table recorded in the mapping table recording means 31. When one pixel position of the output image corresponds to a plurality of pixel positions of the input image, the degrees-of-necessity of these pixels are also read from the mapping table.

[0069] The image synthesizing means 40 looks up the mapping table looking-up means 30 and reads, from the imaging means 10, the pixels of the input image corresponding to the pixels of the output image to be generated. If a pixel of the output image corresponds to only one pixel of the input image, the value of that input pixel is output to the image outputting means 50. Also, if no corresponding pixel is present, a previously decided value is output to the image outputting means 50.

[0070] If one pixel position of the output image corresponds to a plurality of pixel positions of the input image, these pixel values are synthesized according to the degrees-of-necessity of the respective pixels, which are looked up simultaneously with the pixel positions of the input image. In the simplest case, these pixel values are added at ratios according to their degrees-of-necessity to give the pixel values of the output image. The image outputting means 50 generates the output image from the pixels of the output image generated by the image synthesizing means 40 and outputs the output image.
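
A minimal sketch of this synthesis rule, taking the weights exactly as read from the mapping table (names illustrative):

    def synthesize_pixel(samples, default=0):
        # samples: (input_pixel_value, weight) pairs looked up for one
        # output pixel, the weight being the recorded degree-of-necessity.
        if not samples:
            return default               # no corresponding pixel
        if len(samples) == 1:
            return samples[0][0]         # single camera: pass through
        total = sum(w for _, w in samples)
        return sum(v * w for v, w in samples) / total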

[0071] In this manner, in the image synthesizing/transforming system of the present embodiment, the data recorded in advance in the three-dimensional coordinate recording means are utilized in the mapping table forming process. Therefore, it is unnecessary to execute the huge amount of computation required for the computation of the three-dimensional positions at the time of forming the mapping table, and only the perspective projection transformation and the distortion correcting computation are required.

[0072] Accordingly, even though the computing means 20 does not have high-performance computation power, the mapping table can be generated at a high speed after the imaging means is installed. As a result, an inexpensive image synthesizing/transforming system having wide versatility and easy maintenance can be implemented.

Second Embodiment

[0073] FIG. 4 is a block diagram of a three-dimensional coordinate recording means used in an image synthesizing/transforming system according to a second embodiment of the present invention. Since the overall configuration and operation of the image synthesizing/transforming system in this embodiment are similar to those in the first embodiment, their illustration and explanation are omitted herein. Only the feature portion of the second embodiment shown in FIG. 4 will be explained.

[0074] The three-dimensional coordinate recording means 32 shown in FIG. 4 is configured to have a recording means 34 for recording the three-dimensional positions on the projection model corresponding to the previously-computed pixel positions of the virtual camera in a data-compressed format, and a decompressing means 33 for decompressing the three-dimensional positions recorded in the data-compressed format in the recording means 34 to restore the original data.

[0075] An operation of the three-dimensional coordinate recording means 32 having such a configuration will be explained hereunder. The recording means 34 records the three-dimensional positions on the projection model corresponding to the previously-computed pixel positions of the virtual camera in a data-compressed format based on predictive coding. Normally, the projection model is defined as an aggregation of smooth surfaces such as planes, curved surfaces, etc. The three-dimensional position is represented as the intersection point between the projection model plane and a straight line indicating the viewing direction of the virtual camera.

[0076] Accordingly, the three-dimensional position also changes relatively regularly on the projection model plane. Therefore, effective data compression can be achieved by predictive coding. For example, as shown in FIG. 5, a high compression ratio can be attained by a simple compressing method that utilizes the preceding component of each component of the three-dimensional position as the predictive value and utilizes the differences (prediction errors) between the respective predictive values and the respective components of the three-dimensional position as the compressed data. FIG. 5 shows one dimension, but this method can be extended easily to three dimensions. Also, since predictive coding can restore the original data by adding the predictive values and the prediction errors, a limited amount of computation is required to decompress the data, and thus the decompressing process can be carried out at a high speed.
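
A one-dimensional sketch of the scheme of FIG. 5: encoding stores prediction errors against the preceding component, and decoding restores the data by running sums (names illustrative):

    def delta_encode(values):
        # The preceding component is the predictive value; the prediction
        # errors are the compressed data (FIG. 5, one dimension).
        out, prev = [], 0
        for v in values:
            out.append(v - prev)
            prev = v
        return out

    def delta_decode(errors):
        # Adding predictive value and prediction error restores the data.
        out, prev = [], 0
        for e in errors:
            prev += e
            out.append(prev)
        return out

    assert delta_decode(delta_encode([5, 7, 8, 8, 9])) == [5, 7, 8, 8, 9]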

[0077] The decompressing means 33 restores the original data by decompressing the three-dimensional position data recorded in the compressed format in the recording means 34. At this time, as described above, high computation power is not required for the decompressing means 33 that decodes the predictive coding.

[0078] In this manner, according to the above configuration, the present embodiment has the advantage of being able to record the data used to form the mapping table in a small memory capacity without needing high computation power. This is advantageous in the situation where, in order to make provision against the case where the mapping table must be reformed for any cause during employment of the image synthesizing/transforming system, the three-dimensional position data on the projection model corresponding to the pixel positions of the virtual camera must continue to be maintained after the mapping table is formed.

Third Embodiment

[0079] FIG. 6 is a block diagram of an image synthesizing/transforming system according to a third embodiment of the present invention. The image synthesizing/transforming system according to this embodiment is configured to have the imaging means 10, the computing means 20, the mapping table looking-up means 30, the image synthesizing means 40, the image outputting means 50, and a calibrating means 60. The configuration of the mapping table looking-up means 30 is similar to that in the image synthesizing/transforming system according to the first embodiment or the second embodiment.

[0080] The calibrating means 60 generates calibration data of the imaging means 10 by correlating a point on the already-known global coordinate system in the screen picked up by the imaging means 10 with the pixel position on the screen, and then outputs the data to the computing means 20 as the camera parameter.

[0081] The computing means 20 computes the pixel positions of the input image necessary for the generation of the output image, based on the camera parameter computed by the calibrating means 60 and the three-dimensional positions on the projection model corresponding to the previously-computed pixel positions of the virtual camera, obtained by looking up the mapping table looking-up means 30. The computed result is recorded in the mapping table looking-up means 30 as the mapping table.

[0082] The image synthesizing means 40 reads the input image corresponding to the pixels of the output image from the imaging means 10 by looking up the mapping table looking-up means 30, and generates the pixels of the output image. The image outputting means 50 generates the output image from the pixels generated by the image synthesizing means 40, and outputs the output image.

[0083] Since the basic operation of the image synthesizing/transforming system having the above configuration is similar to that of the first embodiment, its explanation will be omitted herein. Only the operation of the calibrating means 60, which differs from the first embodiment, will be explained hereunder.

[0084] FIG. 7 is a configurative block diagram of the calibrating means 60. The calibrating means 60 is configured to have a mark superposing means 63 for superposing marks for alignment on the input image, a monitor 64 for displaying the image on which the marks are superposed, a controller 61 for indicating the display positions of the marks, and a calibration computing means 62 for computing the camera parameter based on the positions of the marks and the coordinates of the points on the already-known global coordinate system, for example.

[0085] FIG. 8 is a view showing a transition example of the display screen of the monitor 64 in FIG. 7. Alignment marks A, B are superposed on the input image by the mark superposing means 63 and then displayed on the monitor 64. The display positions of the alignment marks A, B are controlled by the controller 61, and the controller 61 informs the mark superposing means 63 of the display positions. In the example shown in the figure, since a plurality of alignment marks A, B are present, the respective numbers of the marks A, B are also given.

[0086] An alignment target C whose coordinates on the global coordinate system are already known is illustrated in the input image. At this time, as shown in the monitor display before alignment (upper portion of FIG. 8), when the marks A, B are displayed displaced from the target C, the display positions of the alignment marks A, B are moved by the operation of the controller 61 to mate with the predetermined points of the alignment target C. Thus, the monitor display state shown in the lower portion of FIG. 8 is brought about.

[0087] In the state shown in the lower portion of FIG. 8, the controller 61 informs the calibration computing means 62 of the end of the mark alignment and also of the display positions of the marks A, B at that time. The calibration computing means 62 executes a calibration of the imaging means 10 that picked up the input image, based on the correspondence between the positions of the marks A, B given by the controller 61 and the already-known coordinates on the global coordinate system, and then outputs the resultant camera parameter to the computing means 20.
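
One classical way to realize the calibration computing means 62, shown here only as an assumed sketch rather than the patent's method, is a direct linear transform over the known global coordinates and the aligned mark positions; it needs at least six correspondences, whereas far fewer suffice when, as in FIG. 8, most camera parameters are already constrained:

    import numpy as np

    def calibrate_dlt(points_3d, points_2d):
        # Direct linear transform: estimate the 3x4 projection matrix P
        # with (u, v) ~ P [X Y Z 1]^T; needs at least six correspondences.
        rows = []
        for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
            rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
            rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
        _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
        return vt[-1].reshape(3, 4)      # smallest right singular vector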

[0088] In the example in FIG. 8, the number of points necessary for the alignment is set to two. The number of points necessary for the alignment depends upon the number of variables that are required for the calibration of the imaging means 10.

[0089] As described above, according to the present embodiment, all the processes executed after the installation of the imaging means can be carried out by the image synthesizing/transforming system alone. Therefore, the installation positions of the imaging means are in no way limited, and the installation of the imaging means is facilitated. Also, even when the fitting positions of the imaging means are displaced for any cause, the mapping table can be reformed without an external high-performance computing unit, and thus ease of maintenance is improved.

[0090] The present invention has been explained in detail with reference to particular embodiments. It is apparent to a person skilled in the art that various variations and modifications can be applied without departing from the spirit and scope of the present invention.

[0091] This application is based on Japanese Patent Application No. 2002-57440 filed on Mar. 4, 2002, the contents of which are incorporated herein by reference.

INDUSTRIAL APPLICABILITY

[0092] According to the present invention, the synthesis/transformation of the image can be implemented with an inexpensive configuration, since the mapping table can be calculated without building in a high-performance computing unit after the imaging means is installed.

1. An image transforming system for transforming an image input from imaging means, which picks up a real image, into a virtual image viewed from a predetermined virtual viewpoint and outputting it, comprising: first storing means for recording a correspondence between a pixel on the virtual image and a point on a projection model, and second storing means for recording a correspondence between the pixel on the virtual image and a pixel on the real image, wherein the pixel on the real image and the pixel on the virtual image are mutually correlated via a point on a predetermined projection model; inputting means for inputting a positional relationship between the imaging means and the virtual viewpoint; and computing means for rewriting contents recorded in the second storing means based on contents in the first storing means and the positional relationship.
2. The image transforming system according to claim 1, further comprising: means for predicting the positional relationship based on an installation position of the imaging means, in place of the inputting means, when the installation position of the imaging means is decided in advance.
3. The image transforming system according to claim 1, further comprising: means for obtaining a relative position of the imaging means with respect to the point on the projection model as calibration data and predicting the positional relationship based on the calibration data, in place of the inputting means.
4. The image transforming system according to claim 1, wherein a positional relationship between the virtual camera and the imaging means is obtained by using calibration data acquired by external calibrating means, in place of the inputting means.
5. The image transforming system according to any one of claims 1 to 4, wherein the first storing means includes: recording means for recording aggregation data of points on the projection model in a compressed format by a predictive coding; and decompressing means for decompressing the aggregation data compressed and recorded by the recording means to restore the aggregation data to an original format.